The text extraction feature lets you pull the text out of a PDF document. Text can be extracted from an entire document (using the getText method of the PdfDocument class) or from a single page (using the getText method of the PdfPage class). In both cases, getText returns the extracted text as a String.
Keep a couple of things in mind when using the getText method to extract text from a PDF:
The following code will extract the text in an existing PDF document.
[Java]
// Create the PDF document object from an existing file
PdfDocument pdfA = new PdfDocument("[PhysicalPath]/MyDocument.pdf");
// Call getText on the document object to get the text of the whole document
String extractedText = pdfA.getText();
The following code will extract the text from a specified page within a PDF.
[Java]
// Create the PDF document object from an existing file
PdfDocument pdfA = new PdfDocument("[PhysicalPath]/MyDocument.pdf");
// Call getText on a single PDF page to get the text from that page only
String extractedText = pdfA.getPages().getPdfPage(1).getText();
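Because getText returns an ordinary String, the result can be processed with standard Java APIs. The following is a minimal sketch of that idea and is not part of the PDF library. The class name ExtractedTextHelper and its two methods are hypothetical, and a literal string stands in for the value returned by getText so that the example runs on its own.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical helper for illustration only; it is NOT part of the PDF library.
final class ExtractedTextHelper {

    private ExtractedTextHelper() {
        // Static helpers only; no instances needed.
    }

    // Returns true if the text contains the keyword, ignoring case.
    // Null arguments are treated as "no match".
    static boolean containsIgnoreCase(String text, String keyword) {
        if (text == null || keyword == null) {
            return false;
        }
        return text.toLowerCase(Locale.ROOT).contains(keyword.toLowerCase(Locale.ROOT));
    }

    // Writes the text to the target file as UTF-8, replacing any existing content.
    static void saveAsUtf8(String text, Path target) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: this literal stands in for
        //   String extractedText = pdfA.getText();
        String extractedText = "Invoice 1001\nTotal due: 42.00";

        // Search the extracted text for a phrase.
        System.out.println("Mentions total: " + containsIgnoreCase(extractedText, "total due"));

        // Save the extracted text to a temporary file.
        Path out = Files.createTempFile("extracted", ".txt");
        saveAsUtf8(extractedText, out);
        System.out.println("Saved to " + out);
    }
}
```

In real use, the placeholder literal would be replaced by the String returned from getText on either the document or a page. The helper itself stays the same.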